I. INTRODUCTION

As a young and developing information system organization, the
Indonesian Army Data Collecting and Processing Service
(DISPULLAHTAD) has a tremendous proliferation of application
files. It is no surprise that there is redundancy of data and
effort. Data redundancy wastes limited resources and, furthermore,
raises the problem of inconsistent data, that is, the same data
element having different values in different files. The
implementation of a database management system (DBMS) could handle
this problem by providing more control and more effective
management of data.

On one hand, the information generated electronically becomes more
and more in demand, to the point where it has become a critical
issue for the Indonesian Army. On the other hand, the personnel
generating and maintaining this information move frequently because
of requirements for military tour of duty and tour of area. This
situation creates problems in keeping information accurate and up
to date. Even though the manual documentation is always done
properly, this is not always adequate. It is often the case that
many applications are highly dependent upon the personnel
responsible for them. Standardized and centralized documentation is
a "must", especially when a DBMS is implemented. In this regard,
the data dictionary is a powerful vehicle that supports such
documentation.

According to Dolk [Ref. 1], "a data dictionary is a collection of
an enterprise's meta-data designed as one or more databases which
can be retrieved and analyzed using standard database management
system capabilities". T have


will be discussed further in a subsequent chapter, and will be
considered as a basis in choosing the most appropriate DBMS to be
implemented by DISPULLAHTAD.

The organizational structure of DISPULLAHTAD, its system
configuration, and its current applications will be described
briefly in order to provide a background for the succeeding
chapters. A discussion of database management systems and a
recommendation of the most appropriate DBMS to be implemented
comprise the last chapter.


II. INDONESIAN ARMY DATA COLLECTING AND PROCESSING SERVICE


A. ORGANIZATION, TASK, AND SYSTEM CONFIGURATION


DISPULLAHTAD is an acronym in the Indonesian language that stands
for the Indonesian Army Data Collecting and Processing Service. It
was initiated in Fiscal Year 1973/1974 and formally organized in
Fiscal Year 19/57i37e [Ref. 2]. DISPULLAHTAD is located at the
Indonesian Army Headquarters in Jakarta, the capital city of the
Republic of Indonesia.

DISPULLAHTAD's main task is to provide all information processed
electronically for all organizational elements of the Army
requiring the information [Ref. 2]. In order to accomplish this
task, DISPULLAHTAD is equipped with several computer
configurations. As of 1984, these include an IBM System 4341, an
IBM System 370, several IBM System 3740s, and several TRS-80s.

At the lower organizational levels, such as the Military Area
Commands (KODAM), Army Finance Service (JANKUAD), Army
Administrative and Personnel Service (JANMINPERSAD), Army
Development and Educational Command (KOBANGDIKLAT), etc., each has
its own Data Processing Service, called PULLAHTA KOTAMA/LAKPGS or
PULLAHTA for short. A simplified organizational structure of the
Indonesian Army is presented as Figure 2.1, in which the positions
of both DISPULLAHTAD and the PULLAHTAs can be clearly seen.

Each PULLAHTA has two roles: first, to process and provide all
pertinent information requested by the organization to which it is
attached, and secondly, to provide data
Figure 2.1 Indonesian Army Organizational Structure
(Simplified Chart).

PULLAHTA is also equipped with some hardware as shown below.

1. PULLAHTAs inside Java Island


PULLAHTAs that belong to the Military Area Commands on Java island
are equipped with an IBM System 4331 connected to the IBM System
4341 at DISPULLAHTAD in "online" mode via dedicated public
telephone lines. This is the first stage in networking all
PULLAHTAs throughout the country. The data interchange is done
electronically through these dedicated lines.


2. PULLAHTAs outside Java Island


PULLAHTAs sited outside Java island are equipped with an IBM System
3740 and currently work in "offline" mode. Eventually they will be
connected to DISPULLAHTAD via dedicated public telephone lines. For
now, the data interchange is done manually using floppy disks
transported by airline, and it takes one to two days for the data
to reach its destination.


B. APPLICATIONS


All applications run by DISPULLAHTAD serve its task of
electronically providing the information needed by the Indonesian
Army. These applications are separated into three kinds of
Management Information Systems [Ref. 3]:


1. Administration Management Information System


This category includes applications in finance, logistics,
personnel, and all applications pertinent to development,
education, and corps/specialty.


2. Military Management Information System


The applications on intelligence and security, territorial,
communication and electronics, and organization, operation, and
training are included in this category.


3. Planning and Controlling Management Information System


Three kinds of applications are included in this category:
planning and budgeting, auditing and control, and command,
control, and communication.


C. DATA REDUNDANCY PROBLEMS


There are many possible data redundancies within those
applications. For instance, consider data about the name, rank,
corps, occupation, etc., of an individual assigned as Intelligence
Officer in a Territorial Unit. His data will appear in at least
four different files: the personnel file, payroll file,
intelligence file, and territorial file. If there is a change to
just one of those data elements, redundant effort is required to
update all four files, and a high level of data inconsistency may
result if any update is missed.
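
The update problem described above can be illustrated with a minimal Python sketch. The officer identifier, field names, and values are hypothetical placeholders and are not taken from any DISPULLAHTAD file; the sketch only shows how one missed update leaves the four files disagreeing.

```python
from typing import Dict, List

Record = Dict[str, str]
File = Dict[str, Record]


def make_files() -> Dict[str, File]:
    """Build four hypothetical application files, each holding its own copy
    of the same officer's data (this duplication is the redundancy)."""
    officer = {"name": "OFFICER A", "rank": "Captain", "corps": "Infantry"}
    return {
        file_name: {"ID-001": dict(officer)}
        for file_name in ("personnel", "payroll", "intelligence", "territorial")
    }


def distinct_values(files: Dict[str, File], officer_id: str, field: str) -> Dict[str, List[str]]:
    """Map each stored value of `field` to the list of files that hold it."""
    seen: Dict[str, List[str]] = {}
    for file_name, file in files.items():
        value = file[officer_id][field]
        seen.setdefault(value, []).append(file_name)
    return seen


files = make_files()

# A promotion is recorded in only two of the four files.
for file_name in ("personnel", "intelligence"):
    files[file_name]["ID-001"]["rank"] = "Major"

# More than one distinct value means the files are inconsistent.
report = distinct_values(files, "ID-001", "rank")
print(report)
```

With a single shared copy of the data element there would be only one place to update, so this kind of disagreement could not arise.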

It happens many times that top-level management detects data
inconsistency between two different reports generated by
DISPULLAHTAD (e.g., the total number of personnel differs between
the personnel report and the payroll report). This has become very
annoying, and it tremendously reduces the credibility of the
computer system. An effort should therefore be made to eliminate
the problem, and one way of doing that is by designing and
implementing a data dictionary system (DDS) in concert with a DBMS.


D. STAGED DEVELOPMENT APPROACH


The design of any information system is the most difficult and
critical step. It should be done with great care and full
awareness. As suggested by Sprague and Carlson [Ref. 4], there are
three different approaches to initiating an information system:


* Quick-Hit approach.

This approach should be used if it is not yet clear whether such
an information system is needed or not, but there is a recognized
high payoff for initiating the system. This approach requires
developing the system in the most beneficial area, capturing the
benefits, and then considering what to do next.

* Staged Development approach.

This approach develops the system in the most beneficial area, as
in the quick-hit approach, but with some advance and clear
planning. Therefore, part of the effort in developing the first
system can be reused in developing the second. This approach is
very appropriate for initiating an information system clearly
supported by top management, but with limited available resources.

* Complete System approach.

This approach requires the longest development time and the
highest development costs before any benefits are attained. Before
building any part of the system, a full-service system generator
and the organizational structure for managing it must be developed
first. In this regard, this approach represents the most risky
option.

For DISPULLAHTAD, it is anticipated that top management will
strongly support the implementation of a DBMS; therefore the
Quick-Hit approach is not necessary. On the other hand, limited
computer resources make a Complete System approach infeasible,
too. Hence, the most appropriate approach is the Staged
Development approach. In the Staged Development approach,
identifying the functional area where there is reason to expect
the highest pay-off in starting the project is crucial. Using the
highest volume of transactions as the criterion, and evaluating
the transactional data gathered between April 1983 and December
1983 (see Figure 2.2), there are three applications having high
transactional volume: the personnel, payroll, and finance report
applications. Only the personnel and payroll applications have a
master file which is maintained and used continuously. Besides
that, these two applications are the most crucial in maintaining
personnel morale and the most often used in relation to the
personnel management task.

Based on these evaluations, data used by the personnel and payroll
applications will be the first database to be implemented by
DISPULLAHTAD. These applications include personnel, payroll,
intelligence personnel, and territorial personnel. It is also
implied that the discussion of the DDS and DBMS will be limited to
those applications.

The plan for this staged development approach is:

1. Initial design of the DD covering the personnel, payroll,
intelligence personnel, and territorial personnel applications.

2. Implementation of the personnel and payroll database.

3. Design of the DD for all data used by the rest of the
applications excluded at the first stage.

4. Implementation of the other databases.
III. DATA DICTIONARY


A. GENERAL 


The revolutionary change in computer technology has created
another challenge: how to organize and manage the very large-scale
databases made possible by the combination of database management
systems (DBMS) and powerful new hardware systems. The need to
control the enterprise's data becomes critical due to the
proliferation of microcomputers that trigger more and more
applications, which in turn create redundancy and data
inconsistency problems. At the same time, the number of
microcomputer users demanding direct access to the enterprise data
is also increasing. This direct access to large and complex
databases again creates a problem of how to "coordinate" and
control these complex information structures.

Data redundancy, data inconsistency, and the need to control the
enterprise's data lead to the design and implementation of
database systems. The database environment itself assumes an
architectural plan designed to minimize redundancy and to
emphasize accessibility. It assumes logical and physical
structures aimed at separate objectives. It also assumes that an
individual file may serve many different applications. All of this
is far too much complexity to be managed without precise and
up-to-date documentation and control. The data dictionary is
designed to define all appropriate aspects of the enterprise's
data, so that it can be used as a tool to control and manage the
database system no matter how great its size and how complex its
structures.




B. INFORMATION RESOURCE MANAGEMENT (IRM)


The concept of IRM is that information is a vital enterprise asset
that should be invested in, and used like other resources
[Ref. 5].

IRM is the task of managing information resources such as data,
processes, users, software, and hardware in an integrated and
coordinated manner. IRM includes all management aspects of the
information-related operations of an organization, such as policy
formulation, resource allocation, implementation, and control.

A definition of IRM was formulated at a workshop on Data
Dictionary Systems and Information Resource Management sponsored
by the Association for Computing Machinery and the National Bureau
of Standards in 1980:


Information Resource Management is whatever policy, action, or
procedure concerning information (both automated and
non-automated) which management establishes that serve the overall
current and future needs of the enterprise. Such policies, etc.,
would include considerations of availability, timeliness,
accuracy, integrity, privacy, security, auditability, ownership,
use, and cost-effectiveness.


This definition of IRM was chosen to emphasize the enterprise-wide
nature of planning and execution of information policies, actions,
and procedures in order that information can be treated as a true
resource. It also reflects the primary shift of data processing
from processing-centered design methodologies to data-centered
methodologies.


1. Data Dictionary as the Tool for IRM


One of the problems encountered in IRM is the vast amount of data
about information resources required to manage the enterprise
data, together with the very complex and numerous relationships
existing between them. This is precisely the sort of task that a
Data Dictionary System can be made to do, provided that it has
been conditioned to know how to deal not just with data entities
or process entities, but with the entire range of information
resource entities.

2. Data Dictionary 


A data dictionary is a collection of meta-data (data about the
enterprise data) that could consist of: the name of the data
(including its synonyms and/or homonyms), the location of the
data, a description of the meaning of the data, the relation
between data, how the data is used, who is responsible for the
data, the source of the data, etc.; in short, a store of all the
appropriate information about the data.
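
As a rough illustration of the kinds of meta-data listed above, the following Python sketch models a single dictionary entry. The class name, field names, and sample values are assumptions made for this example only; they are not the layout of any particular DDS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaDataEntry:
    """One hypothetical data dictionary entry: data about a data element."""
    name: str                     # name of the data
    description: str              # meaning of the data
    location: str                 # where the data is stored
    source: str                   # where the data originates
    responsible: str              # who is responsible for the data
    synonyms: List[str] = field(default_factory=list)    # alternate names
    related_to: List[str] = field(default_factory=list)  # relations to other data
    used_by: List[str] = field(default_factory=list)     # how the data is used


# Illustrative values only.
rank = MetaDataEntry(
    name="RANK",
    description="Military rank of an individual",
    location="personnel master file",
    source="personnel application",
    responsible="personnel application owner",
    synonyms=["GRADE"],
    related_to=["NAME", "CORPS"],
    used_by=["personnel report", "payroll report"],
)
print(rank.name, rank.synonyms, rank.used_by)
```

Note that the entry holds no actual rank values: it describes the data element, which is what distinguishes meta-data from the data itself.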

Recently, there has been a trend towards extending the data
dictionary to include the following functions:

a. Definition of other data constructs such as records and files.

b. Definition of processes such as programs or manual processes.

c. Definition of data users, whether individuals or organizational
entities.

Along with these definitions, the data dictionary also began to be
used to document the cross-references between them and to record
their usage and organizational responsibilities.

3. Data Dictionary System (DDS)


A Data Dictionary System is a combination of software and
procedures that aid an enterprise in setting up and maintaining
its complex structure of data resources. The software itself may
be produced in-house or acquired from software vendors. For the
following three reasons, it is often better to purchase it than to
build it in-house:




a. Design and Implementation


The task of designing and implementing a DDS, even one of modest
functionality, is definitely a non-trivial one. There is a good
chance that the magnitude of the task will be underestimated and
that greater resources will be needed than those originally
estimated.


b. Gaining a Success


If the use of a DDS is to be at all successful, the software
itself must conform to high standards of quality assurance. The
use of the same software at many other installations, as is the
case with a commercial package, aids in the early discovery of
possible software errors and their correction.

c. Technology Progress 


There are good reasons for assuming that DDS technology will
continue to progress and that substantial enhancements will
significantly increase the usefulness and value of the DDS, and it
will be difficult for an in-house system to keep pace.

There are several DDSs commercially available now, such as DB/DC
Data Dictionary by IBM, DATAMANAGER by MSP Inc., Integrated Data
Dictionary (IDD) by Cullinet Software Inc., DATADICTIONARY by
Applied Data Research Inc., Extended Data Dictionary (XDD) by
Intel Systems Corp., UCC TEN by University Computing Company, and
Data Control System (DCS) by Cincom Systems Inc.

In the following section, features that one could expect to find
in those available DDSs will be discussed. It must be pointed out
that none of the available DDSs mentioned above will necessarily
have all the features explained, and there is no implication that
all of these features are required in all DDS applications.
C. FEATURES OF A DATA DICTIONARY


Several issues will be discussed here: Implementation and
Architectural Issues, Data Dictionary (DD) and DD Schema,
Extensibility Facilities, Status Facilities, Dictionary Commands,
Bridge Facilities, and Data Dictionary System Security.


1. Implementation and Architectural Issues 




a. The Relationship Between DDS and DBMS 


The primary purpose of a DBMS is to manage data, whereas the
primary purpose of a DDS is to manage meta-data. It is therefore
clear that there is very little overlap between the two; in fact
they are complementary, and both functions are required for proper
management of information resources.

Some of the functions a DDS performs are in support of one or more
DBMSs. This is to be expected, as the DDS manages all meta-data,
including meta-data for which the actual instances of data are
stored in a database, which in turn is managed by a DBMS. To
manage the data in its databases, the DBMS needs knowledge of
certain meta-data about those databases in order to do its
processing. This meta-data is commonly referred to as the
DBMS-Directory, and it should be clear that this is potentially
one area of overlap between the DDS and the DBMS. In this sense,
it is preferable to design a DDS prior to the DBMS implementation
rather than to build a DDS that has to be "fitted" to an existing
DBMS. This reason, together with the existing practice of
implementing a DDS as one of the DBMS's applications, explains why
in this thesis the design of the dictionary is done prior to the
design of the databases.




b. The Method of DDS Implementation


It is preferable to design a data dictionary prior to the
implementation of the DBMS. On the other hand, the implementation
of that designed data dictionary as a complete DDS is a good
candidate to be one of the DBMS applications, and indeed a number
of existing DDSs are implemented in this manner. But this is not
the only option; the implementation of a DDS can be either:

* DBMS-dependent system.

This is a DDS that uses a DBMS in its implementation.

* Free-standing system.

A DDS that does not use a DBMS in its implementation is considered
to be in this category.

There is no ultimate answer as to whether a free-standing or a
DBMS-dependent DDS is best. There are both pros and cons, and
these depend on the enterprise's particular circumstances, such as
whether the enterprise implements a DBMS or not, and whether it
uses multiple DBMSs or a single DBMS for its databases.
Enterprises that intend to implement a DDS but not a DBMS will
favor a free-standing system. On the other hand, enterprises
having multiple DBMSs may implement a DBMS-dependent system and
choose one of their DBMSs to implement it; or they may implement a
free-standing system in order to provide more flexible and fair
control.

Other considerations include DDS security and the view that the
scope of DDS usage is substantially broader than the DBMS
environment. With a DBMS-dependent system, personnel familiar with
the use of the DBMS may find it easier to break the DDS security
than would be the case with a free-standing system.
For the sake of completeness, there is another DDS implementation
method, referred to as the integrated DDS. This method eliminates
the overlap between the data dictionary and the DBMS-Directory by
combining the two into one. The advantage gained is that the
meta-data is no longer stored redundantly.

c. Active and Passive DDS


For processes that require meta-data for their execution, there
should be a command or series of commands, representing some DDS
functionality, that produces the required meta-data. This
functionality is called the dictionary interface, and there are
two kinds of such interface: active and passive.

An active interface means that all processes that require
meta-data will use the most current meta-data in the data
dictionary. Similarly, all processes which in the course of their
execution generate meta-data are required to store the generated
meta-data in the data dictionary.

A passive interface, on the other hand, makes everything that an
active interface requires optional. A process that requires
meta-data for its execution may retrieve it either from the data
dictionary or from some other location; or, if the process already
contains the meta-data, the system has the option of checking
whether or not this meta-data is the most current version in the
data dictionary. When a process generates meta-data, it likewise
has the option to store or not to store the generated meta-data.

Therefore, two conclusions can be drawn about the dictionary
interface:

(1) A DDS may have some interfaces which are active and others
which are passive.

(2) The fact that an interface is active is a property not only of
the DDS, but also of the overall system of which the DDS is a
part.
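
The difference between the two kinds of interface can be sketched in Python as follows. The class and function names are hypothetical and only illustrate the distinction described above: the active path always goes through the dictionary, while the passive path leaves both the lookup and the currency check to the caller.

```python
from typing import Dict, Optional


class DataDictionary:
    """A toy meta-data store, assumed for illustration."""

    def __init__(self) -> None:
        self._meta: Dict[str, str] = {}

    def store(self, name: str, definition: str) -> None:
        self._meta[name] = definition

    def retrieve(self, name: str) -> str:
        return self._meta[name]


def run_active(dd: DataDictionary, name: str) -> str:
    """Active interface: the process always uses the dictionary's
    current meta-data."""
    return dd.retrieve(name)


def run_passive(
    dd: DataDictionary,
    name: str,
    local_copy: Optional[str] = None,
    check_current: bool = False,
) -> str:
    """Passive interface: the process may use its own copy of the
    meta-data, and checking it against the dictionary is optional."""
    if local_copy is None:
        return dd.retrieve(name)
    if check_current and local_copy != dd.retrieve(name):
        raise ValueError(f"local meta-data for {name} is out of date")
    return local_copy


dd = DataDictionary()
dd.store("RANK", "CHAR(10)")

current = run_active(dd, "RANK")                       # dictionary version
stale = run_passive(dd, "RANK", local_copy="CHAR(8)")  # unchecked local copy
print(current, stale)
```

The sketch also hints at conclusion (2): nothing in `DataDictionary` itself makes the interface active; that depends on every process in the surrounding system being obliged to call it.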


2. Data Dictionary and Data Dictionary Schema


Data Dictionary denotes the organized and structured collection of
meta-data which comprises the contents of the DDS. The data
dictionary schema denotes the logical structure of the data
dictionary, in a manner analogous to the use of the same term in
the context of a DBMS.

The structural characteristics of the data dictionary and the
contents of the data dictionary schema determine what kinds of
meta-data can be stored in the data dictionary and what kinds of
relationships can be established between them. Some systems have
extensibility facilities whereby an installation can customize the
data dictionary schema.

The schema is described in logical terms in order to gain a
clearer insight into what kinds of meta-data are supported by the
DDS. This logical description will, of course, be quite different
from the manner in which these structures are actually implemented
in a specific system. The description should be made independent
of any implementation.

A data dictionary has a conceptual similarity to an
Entity-Relationship-Attribute model. The basic unit in the data
dictionary is a Dictionary Entity, or Entity for short. Entities
represent real-world objects or things about which certain
information exists in the data dictionary. The information about
the entities themselves exists in the form of Attributes, which
generally denote the qualities or quantities of properties of the
entities. Finally, the data dictionary also contains information
about Relationships between entities, and relationships may have
attributes assigned to them.

The term Entity-type is applied to a group of entities that are
similar to one another. For example, if a set of files is
described in the data dictionary, each file will be represented by
a distinct entity. It then becomes useful to establish an
Entity-type called File in the data dictionary and to say that all
such entities representing files have the entity-type File.
Attributes of entities of the same type will exhibit a certain
degree of similarity. Entities of entity-type File will likely
have an attribute giving the access method used, and perhaps
another attribute showing the blocking factor used. Access method
and blocking factor are then each called an Attribute-type
associated with the entity-type File. Besides the entity-type
File, there will be an entity-type Record. Information would exist
in the data dictionary explaining which types of records are
included in a particular file. All such relationships between file
entities and their associated record entities are then called a
Relationship-type. In conclusion, the data dictionary schema can
be viewed as containing all existing entity-types,
relationship-types, and attribute-types. Any one of these three
types may also be referred to as a schema descriptor.
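
The File/Record example above can be expressed as a minimal Python sketch in which the schema holds the three kinds of schema descriptor and the dictionary validates its contents against them. The class names, the relationship-type name "contains", the attribute spellings, and the sample entity names are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Schema:
    """Holds the schema descriptors of a toy data dictionary."""
    # entity-type name -> names of its attribute-types
    entity_types: Dict[str, Set[str]] = field(default_factory=dict)
    # relationship-type name -> (source entity-type, target entity-type)
    relationship_types: Dict[str, Tuple[str, str]] = field(default_factory=dict)


@dataclass
class Dictionary:
    schema: Schema
    # entity name -> (entity-type, attribute values)
    entities: Dict[str, Tuple[str, Dict[str, str]]] = field(default_factory=dict)
    # (relationship-type, source entity, target entity)
    relationships: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_entity(self, name: str, entity_type: str, **attributes: str) -> None:
        """Store an entity, rejecting attributes the schema does not declare."""
        allowed = self.schema.entity_types[entity_type]
        unknown = set(attributes) - allowed
        if unknown:
            raise ValueError(f"attribute-types not in schema: {sorted(unknown)}")
        self.entities[name] = (entity_type, dict(attributes))

    def relate(self, relationship_type: str, source: str, target: str) -> None:
        """Record a relationship, checking the entity-types on both ends."""
        expected = self.schema.relationship_types[relationship_type]
        actual = (self.entities[source][0], self.entities[target][0])
        if actual != expected:
            raise ValueError(f"{relationship_type} expects {expected}, got {actual}")
        self.relationships.append((relationship_type, source, target))


schema = Schema(
    entity_types={"File": {"access_method", "blocking_factor"}, "Record": set()},
    relationship_types={"contains": ("File", "Record")},
)
dd = Dictionary(schema)
dd.add_entity("PERSONNEL-MASTER", "File", access_method="indexed", blocking_factor="10")
dd.add_entity("PERSONNEL-RECORD", "Record")
dd.relate("contains", "PERSONNEL-MASTER", "PERSONNEL-RECORD")
print(dd.relationships)
```

The schema thus determines what can be stored: an attribute or relationship that has no corresponding descriptor is rejected rather than silently recorded.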

Every entity in the data dictionary has a primary name which,
depending on the particular DDS, will be unique either in the
dictionary or within the entity-type to which the entity belongs.
Some systems accommodate duplicate user-supplied names by
assigning them distinct sequential numbers. In this case, the
concatenation of the user-supplied name and the sequential number
constitutes the unique dictionary name. The allowable length of
the primary name should be large enough for the name to convey the
meaning of the entity. It is common that at least some entities
will also be known by other names. Such alternate names are called
aliases or synonyms, and what matters most is the capability of
the DDS for tracking them and allowing access to the data
dictionary via these alternate names. Sometimes it is convenient
if non-unique synonyms are allowed. To fulfill this requirement,
DDSs have facilities for tracking synonyms either as attributes of
the respective entities, or as separate entities related to the
primary entity. In that case, it is important that the system be
able to recognize the context in which the synonym is used.
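
A minimal Python sketch of primary names and non-unique synonyms follows. It assumes one of the two conventions mentioned above (primary names unique within an entity-type) and uses the entity-type as the "context" for resolving a synonym; the class name, method names, and sample names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Key = Tuple[str, str]  # (entity-type, primary name)


class NameIndex:
    """Tracks primary names and synonyms for dictionary entities."""

    def __init__(self) -> None:
        self._primary: Set[Key] = set()
        self._synonyms: Dict[str, List[Key]] = defaultdict(list)

    def add_entity(self, entity_type: str, primary_name: str) -> Key:
        """Register an entity; the primary name must be unique in its type."""
        key = (entity_type, primary_name)
        if key in self._primary:
            raise ValueError(f"duplicate primary name {primary_name} in {entity_type}")
        self._primary.add(key)
        return key

    def add_synonym(self, synonym: str, key: Key) -> None:
        """Attach a synonym; the same synonym may point at several entities."""
        if key not in self._primary:
            raise KeyError(key)
        self._synonyms[synonym].append(key)

    def resolve(self, name: str, entity_type: Optional[str] = None) -> List[Key]:
        """Find entities by primary name or synonym, optionally narrowed
        to one entity-type (the context)."""
        matches = {key for key in self._primary if key[1] == name}
        matches.update(self._synonyms.get(name, []))
        if entity_type is not None:
            matches = {key for key in matches if key[0] == entity_type}
        return sorted(matches)


index = NameIndex()
rank = index.add_entity("Element", "RANK")
pay_grade = index.add_entity("Record", "PAY-GRADE")
index.add_synonym("GRADE", rank)
index.add_synonym("GRADE", pay_grade)

ambiguous = index.resolve("GRADE")           # both entities match
in_context = index.resolve("GRADE", "Record")  # context removes the ambiguity
print(ambiguous, in_context)
```

Without the context argument the synonym is ambiguous, which is why a system that permits non-unique synonyms needs some way to know where the name is being used.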

The attributes can be differentiated into several attribute-types,
among which are: Description, which consists of an English
language statement describing the meaning of the entity;
Classification keywords, which are attached to entities and can
then be used for selective retrieval of those entities; and
Audit-attributes, which are generated by the DDS and indicate, for
each entity, the identification of the person who created it, the
date of creation, the identification of the last person who
modified it, the date of last modification, and the total number
of modifications.

The entity itself can be conveniently separated into three
entity-types: Data, Process, and Usage entity-types.

a. Data Entity-types


The most common entity-types of this kind, listed with typical
attribute-types and relationship-types, are:

(1) Item/Data Element. In some systems the lowest entity-type is
Item, which is considered to be the atomic unit. In other systems,
the lowest entity may be Data Element, which in turn may contain
other Data Elements. This is usually specified by a contains
clause, which expresses the relationship.

Commonly provided attribute-types relate to the physical
characteristics of the Item/Element, including distinctions
between Source, Target, and Internal representations, and the
validation criteria that may be required for the real-world
instances of the Item/Element.

(2) Group/Record. Systems that recognize the Item as an
entity-type will contain Group as a separate entity-type, whereas
systems that have Data Element as an entity-type do not have an
entity-type for Groups. Record is logically the same as a Group;
therefore separate entity-types for both of them may or may not
exist. A relationship-type is provided to express the structure of
the Group/Record. Commonly available attribute-types relate to the
manner in which the constituent elements are aligned, and other
physical characteristics of the entity.

(3) File. Relationship-types are provided to express the structure
of the file. Attribute-types relate to the access method used,
blocking and labelling information, etc.

(4) DBMS-related Entity-types. The entity-types that exist are
dependent on the specific DBMS for which the DDS provides support
services. In all cases, the entity-types equate to the various
data descriptions used by the DBMS, such as Schema, Subschema,
Database Directory, etc. Relationship-types and attribute-types
are provided to allow the DDS to express the structure of these
entities.

(5) Other Data Entity-types. Some DDSs offer Report, Screen, and
Form entity-types. In each case, relationship-types are provided
that allow the contents of such entities to be specified in terms
of the constituent elements.


b. Process Entity-types


The two most common Process Entity-types are Program/Module and
System/Subsystem, discussed below.

(1) Program/Module. This entity-type represents information about
a collection of executable code. Typical attribute-types are the
language of the source code, the size, and the characteristics
under which it operates. Relationship-types are provided to other
Programs/Modules, as well as to the data, i.e., databases, files,
and elements, on which it operates. Generally speaking, different
relationship-types are provided for input, output, and
processing-in-place.

(2) System/Subsystem. This entity-type describes a collection of
programs and/or modules associated with a major function of the
enterprise. Relationship-types are provided to associate a System
with Subsystems, as well as with the constituent Programs.

c. Usage Entity-types 


Users and their organizational environment, and the data
communication environment, can be thought of as Usage (or
External) entity-types. They are not directly components of a
system, as data and processes are, but they nevertheless play an
important role in its operation.

The User and organizational component entity-types may have
relationship-types that allow users to be associated with
organizational components. These components themselves, and
selected relationship-types that associate users or components
with data and process entities, may describe the responsibilities
assigned to those users.

Examples of Data Communication environment entity-types are
Terminals, Messages, and other entity-types that describe the
communication networks. Their relationship-types may provide the
associations of such entities with other usage entities, e.g., a
given terminal is assigned to a certain set of people or a certain
organizational unit. They may also provide such associations with
data and process entities, e.g., a given terminal is authorized to
execute only a given set of transaction programs, or to access
only certain Files or Databases. It is appropriate to note here
that the role of the DDS in these matters is strictly that of a
repository for documentation, and that the DDS by itself cannot be
expected to enforce such restrictions and limitations. In order
for the DDS to be used for enforcement, appropriate "active"
interfaces would have to exist to assure that the restrictions and
controls which are documented in the data dictionary are always
invoked at execution time. This would, on the other hand, create
more complexity and much overhead.


3. Extensibility Facilities


The concept of extensibility facilities is to allow
an installation to modify the system-standard schema as
delivered by the DDS vendor. Any new schema descriptor
created through the use of extensibility facilities will be
referred to as an extensibility descriptor.

Extensibility facilities are extremely powerful, and
they should be used with great care because extensions
to the system-standard schema, once they are used, can
only be undone with some difficulty. Additionally,
changes to the system may create confusion among the users
of the DDS and decrease their confidence in the
system. For these reasons, it is recommended that their
usage be restricted to the Dictionary Administrator,
the person who is responsible for the DDS function, i.e.,
the recording of all meta-information and meta-data and its
maintenance through the use of the DDS, along with managing
its facilities available to the users of the system.

There are three kinds of such facilities that exist
in current systems:

a. Entity-type extensibility: the ability to add new
entity-types to the dictionary.

b. Attribute-type extensibility: new attribute-types
for either entity-types or relationship-types can be
declared using this facility.

c. Relationship-type extensibility: this facility
allows the installation to declare new relationship-types.
4. Status Facilities 


These facilities allow the DDS to be used in a
System Life Cycle environment where, for instance, a certain
entity may be part of both a production system and a new test
system. Due to its intended usage, it is preferable to
maintain the same name for the entity in different stages.
Therefore, a facility is required that will allow two
distinct dictionary entities to have the same name, yet
different attributes and relationships. In some systems,
such entities can be distinguished by assigning different
version numbers to them. In systems having a status
facility, it can be accomplished either by:

a. Appending the entity-status to the entity name,
which provides the uniqueness of the name.

b. Logically partitioning the dictionary into separate
databases for different statuses, and requiring
uniqueness of the name only within each partition.
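
The two ways of implementing a status facility can be contrasted with a minimal Python sketch. The class names, the status values "PRODUCTION" and "TEST", and the block_size attribute values are all assumptions made for illustration; they are not taken from any particular DDS product.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    name: str
    status: str  # illustrative values: "PRODUCTION", "TEST"
    attributes: dict = field(default_factory=dict)


class QualifiedNameDictionary:
    """Approach (a): append the status to the name to form a unique key."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        key = f"{entry.name}.{entry.status}"
        if key in self._entries:
            raise KeyError(f"duplicate dictionary entry: {key}")
        self._entries[key] = entry

    def get(self, name, status):
        return self._entries[f"{name}.{status}"]


class PartitionedDictionary:
    """Approach (b): one partition per status; names are unique per partition."""

    def __init__(self):
        self._partitions = {}

    def add(self, entry):
        partition = self._partitions.setdefault(entry.status, {})
        if entry.name in partition:
            raise KeyError(f"duplicate name in {entry.status}: {entry.name}")
        partition[entry.name] = entry

    def get(self, name, status):
        return self._partitions[status][name]
```

In both variants the same entity name can be registered once per status, so a production entry and a test entry may carry different attributes without a name clash.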


5. Dictionary Commands 


A DDS may have one or more interfaces that allow a
user to interact with the dictionary. Such an interface may
be in the form of:

* A command language.

* A screen-oriented interface.

* A preformatted data entry facility.

* A programmatic interface that allows user-written
application programs to access the dictionary.

A screen-oriented interface is more user-friendly
than the others. It results in higher utilization of
computer system resources, but makes the DDS available to a
larger class of users. Another benefit may be that the
error rate is substantially reduced.

The dictionary commands may be differentiated into
eight categories on the basis of their functionality:

a. Dictionary Maintenance Commands: these enable an
installation to create and maintain its data dictionary.

b. Reports and Queries Commands: an installation
may generate reports using meta-data contained within the
data dictionary using these commands.

c. Data Structure Interface Commands: enable other
systems to use the descriptions of data structures contained
within the dictionary.

d. Extensibility Commands: the vehicle to exploit the
extensibility facilities.

e. Status-related Commands: the ability to distinguish
entities in different stages of the life cycle.

f. Security Commands: used to allow security declarations
to be assigned to the dictionary.

g. Dictionary Processing Control Commands: used to
control the dictionary process, such as
processing defaults, etc.

h. Dictionary Administrator Commands: exclusive
commands especially designed to be used by the dictionary
administrator.
Other facilities of a DDS exist in the form of
bridges or interfaces to other systems. The contents of the
dictionary may be made available as part of the processing
functions of that system. In each case, these systems
provide tools whose functionality is outside the DDS but
which require data about entities which can be expected to
exist in the dictionary. By accessing a dictionary and
extracting the required information, the disadvantage of
having to store and maintain redundant data is eliminated.

Such interfaces are:

a. Report and Query System: the ability to support
various kinds of reports and queries.

b. Validation Criteria: support other systems by
providing a module which performs the specified validation
and which can be inserted into a program.

c. Database Design: the ability to provide the
data needed by automatic database designers.

d. Test Data Generation: support test data generation
by providing descriptions of the structures and formats
of the files and databases.


As mentioned before, the term DDS security is
applied here to denote the security of the DDS itself.
Entities in the dictionary may have attributes describing
access characteristics to the real-world instances of these
entities, but this data is entirely informational in nature
and cannot be enforced by the DDS, since the system is not
part of the loop in the execution of programs against the
"real data". On the other hand, unauthorized access to the
DDS can be prevented by applying security procedures,
e.g., the assignment of passwords, or the inclusion of
security levels which control the various kinds of access to
dictionary entities. To gain more integrity and reliability,
the security of the DDS must be considered to be related to
the security of the entire computer system. The level of
security existing in the computer system is influenced by
the security of the basic systems software and the physical
security of the installation, as well as the procedures used
by the personnel of the installation. The latter are often
quite lax, and it is not at all unusual to observe cases
where passwords are not kept confidential and may, indeed,
openly be shared with unauthorized people.


D. COST/BENEFIT ANALYSIS FOR DDS 


As with any system acquisition, a cost/benefit analysis
should be done beforehand in order to justify
DDS acquisition, implementation, and usage. The
following list of costs and savings represents tangible items
that can be used in assessing the costs and benefits for an
economic study of the feasibility of implementing a DDS.
1. Costs 


There are eight possible costs which may be considered:

a. Acquisition cost is the accumulation of lease or
purchase cost and the maintenance cost of the system.

b. Data Administration staff cost is self-explanatory.

c. Hardware cost is the sum of storage device
and CPU time cost.

d. Start-up cost is the total cost of training data
administrator staff and all activities such as developing a
comprehensive plan (see Table I as an example) that should
be done in any data dictionary implementation.
TABLE I
Comprehensive Plan for DDS Usage

1. Development of a policy for the use of the DDS.

2. Development of standards to be followed in the
dictionary, including naming conventions for
dictionary entities.

3. Development of decisions on how to use the control
facilities of the DDS, such as the status and
security facilities.

4. Delegation of authority and responsibility for the
use of various DDS facilities.

5. Definition of procedures for the use of the DDS
and development of the required policies to implement
these procedures.

6. Design and implementation of customized features
for the DDS, should any be needed. These may
include changes to the dictionary schema to allow
new types of information resources to be stored
in the dictionary, production of specialized
reports, or interfaces to other S/W systems.
e. Data collection cost is a function of the number
of entities, attributes, and relationships which are to be
put in the dictionary.

f. Maintenance cost will depend on the degree of
changes to the application system or systems controlled by
the DDS.

g. Application system change cost is a cost pertinent
to any change to the application systems due to the
implementation of the dictionary for reasons of efficiency,
integrity, and maintainability.

h. User education cost is the cost of training
people involved in data dictionary usage in addition to the
data administrator staff.
2. Factors for Estimating Savings and Benefits

There are four factors which can be used as an aid
in quantifying the estimated savings. The greater the
degree to which these four factors apply to the
enterprise and its operations, the closer the actual savings
and benefits can be expected to come to the high end of the
estimates. Those four factors are:


a. The Maturity of an Information Processing
Environment

This is a major factor in the benefits that can
be attained with a DDS. Increased maturity will help
substantially in the integration of dictionary facilities
into the operations of the enterprise.

b. The Complexity of the Environment


The number of data elements, files, databases, 
and programs can be used to measure how complex the informa- 
tion processing environment is. Problems caused by 
complexity tend to worsen geometrically with the number of 
such elements, files, databases, and programs. The value of 


a DDS will be greater as complexity increases.
c. The Degree of Data Sharing 


It should be common practice that data elements
are shared by different programs, some of which are
from different systems. An important issue is that changes
in one part of the system tend to have effects in many other
parts of the system. Failure to compensate for such changes
can cause production failures and unanticipated costs.
Tracking the effect of these changes is a valuable feature
of a DDS. It is made possible by evaluating the attributes
and relationships of changed entities as the basis for
further tracking.
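
The tracking of change effects through dictionary relationships can be pictured as a graph traversal. The following Python sketch assumes the dictionary can supply, for each entity, the list of entities that use it; the function name and the entity names in the usage note are hypothetical and serve only to illustrate the idea.

```python
from collections import deque


def affected_entities(used_by, changed):
    """Return every entity reachable from `changed` via 'is used by' links.

    `used_by` maps an entity name to the entities that directly use it,
    e.g. a data element to the records containing it, or a record to
    the programs that read it.
    """
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    # The changed entity itself is not reported, even if a cycle leads back to it.
    seen.discard(changed)
    return seen
```

For instance, with a hypothetical mapping in which an element is used by two records, and those records by two programs, a change to the element is reported as affecting both records and both programs. A breadth-first search is used here only because it is simple; any traversal of the relationship graph would serve.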
d. Personnel Turnover

The personnel associated with information
processing systems are either data processing organization
personnel or user organization personnel who deal with
the information processing system. In this regard, a DDS
offers two advantages. First, information which otherwise
might be stored in the minds of individuals, and which may be
lost to the organization with the loss of the individual, is
now placed in the DDS. Secondly, the learning curve for new
personnel is steeper than it would be without the use of a
DDS.


There are five areas in which savings and benefits
may be expected. The four factors mentioned in the previous
section can be used as aids in predicting specific monetary
savings in each of the following areas:

a. System design and development: the prime advantage
of the DDS is in its use for better communication
between users and implementors. This results in fewer
changes or iterations and consequently faster programming
because the specification is better documented and
understood by all parties.

b. System maintenance: better and more complete
documentation in the dictionary, the ability to analyze the
effect of proposed changes, and the improved communication
between users and maintenance programmers on proposed
changes or corrections.

c. Data redundancy: reduction of unplanned data
redundancy will result in an improved system which has
greater integrity and better operability as well as
potentially decreased requirements for random storage
devices.

d. Database creation: may take advantage of the
descriptions contained within the DDS to reduce iterations
in the design process and reach faster concurrence by all
parties on the contents of a database.

e. Improved communication: self-explanatory.


A. GENERAL

As mentioned before, the initial design of the Data
Dictionary will be limited to four application areas:
personnel, payroll, intelligence personnel, and territorial
personnel. This design is a first step in the Staged
Development approach used; therefore it may be expanded in
the future.


B. DATA DICTIONARY SCHEMA/SUBSCHEMA

In a manner analogous to the context of a DBMS, the Data
Dictionary Schema denotes the logical structure of the data
dictionary (DD). In this regard, then, the term Subschema
will denote a subset of the schema to be seen by a given
application (process) or user [Ref. 6], and it is compatible
with application views of a database [Ref. 7].


The structural characteristics of the DD and the
contents of the DD schema are important aspects of the usage
of a DDS, since by evaluating these, users may know what
kinds of meta-data and relationships between them exist
within the DD. As suggested by Lefkovits et al [Ref. 8],
the structural characteristics of a DD may be described in
logical terms in order to gain a clearer insight into what
kinds of meta-data are supported by the DDS. The logical
structure of a typical DD drawn by Allen et al [Ref. 5] seems
to be appropriate for DISPULLAHTAD's DD with minor
changes, and it is presented as Figure 4.1.
Figure 4.1 Logical Structure of DISPULLAHTAD's DD
(adapted from Allen et al [Ref. 5]). 


There are three kinds of entities (from the left to 


the right) within the logical DD in figure 4.1: 


a. Data Entities 


These consist of database, subschema, relationship,
file, group of elements, and data element entities.
Each record may consist of some elements, but may
or may not have a group of elements in it. A file and/or record
entity is the subject of a process entity, while a data
element entity is the subject of a transaction or a report
entity.

b. Process Entities

A process entity may be an application program,
a program module, or a system/subsystem that typically
generates a report or does a transaction involving either
a data element or a group of data elements.
c. Usage Entities 


Included in this category are users and their
organizational environment (such as processors and terminals),
and the data communication environment (such as
communication network, communication nodes, messages, etc.).

Since there are four applications included in this
initial design, there might be four subschemas accordingly.
Due to the relatively small amount of meta-data that will be
stored in the initial dictionary, however, and in order to
reduce complexity, it is better not to apply subschemas at
this stage. Later, if the dictionary has grown substantially
and many applications have been added,
subschemas may be applied in order to improve efficiency and
security.
C. DISPULLAHTAD'S DATA DICTIONARY

The design of this DD is based on current applications
for which the DBMS has not been implemented yet. The
following tables present the lists of Data, Process, and
Usage entities and one or two of the actual instances of
DISPULLAHTAD's data dictionary.


1. Entities 


a. Data Entities 


Table II summarizes data entities abstracted
from the personnel application [Ref. 9 and Ref. 10]. Table
III presents data entities for the payroll application [Ref.
11 and Ref. 12]. Data entities used in the intelligence
personnel application are presented as Table IV [Ref. 13 and
Ref. 14]. Finally, data entities from the territorial
personnel application are shown in Table V [Ref. 15].

b. Process Entities

Process entities belonging to the four applications
are presented as Table VI. This table lists only
routine processes, not irregular processes, since the latter
are not yet standardized (they are done as requested).

c. Usage Entities

Table VII presents user entities for the four
applications.
The relationships can be represented by the
following relation:

RELATIONSHIPS (relation_name, key, composite_key, secondary_key,
               other_entity_names)

a. Relation Name

In the pre-DBMS situation, this may be filled with
the record entity name to represent the relationships
between element data entities. In the case where a DBMS has
been implemented, it should be filled with the relation name,
since a logical record may consist of several relations in
order to reduce complexity and/or fulfill the five normal
forms.
b. Key

This is an entity name that is used as the key
for both storing and accessing the relation. If the relation
uses a composite key, this will be the first part of the
composite key, and the second part will be stored as the
composite-key attribute.

c. Composite Key

This is the entity name used as the second part
of a composite key. If the primary key is not composite, this
attribute will be filled with "NONE".

d. Secondary Key

This is the entity name used as the secondary
key. If the relation has no secondary key, this attribute
will be filled with "NONE".


e. Other Entity Names

The names of other entities (except keys) in the
relation will be filled in this attribute. An ampersand sign
( & ) will be used between two entity names, and the phrase
"REPETITIONS OF" will be written in front of a repeating
group of elements.
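
The encoding convention for the other-entity-names attribute can be illustrated with a small Python sketch that splits such a value back into its parts. The function name and the sample value in the usage note are assumptions made for illustration; only the ampersand separator and the "REPETITIONS OF" prefix come from the convention described above.

```python
REPEATING_MARKER = "REPETITIONS OF"


def parse_other_entity_names(text):
    """Split an other_entity_names value into (entity_name, is_repeating) pairs."""
    result = []
    for part in text.split("&"):
        part = part.strip()
        if not part:
            continue  # tolerate stray separators
        repeating = part.upper().startswith(REPEATING_MARKER)
        if repeating:
            # Drop the marker, keeping only the name of the repeating group.
            part = part[len(REPEATING_MARKER):].strip()
        result.append((part, repeating))
    return result
```

A value such as "NAMA & PANGKAT & REPETITIONS OF ASING" (a made-up combination of entity names) would be read as two ordinary entities followed by one repeating group.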


3. Attributes

Every entity has information attached to it called
attributes. In the following, all information that may be
included as attributes will be discussed.

a. Data Entities

There will be several pieces of information
included for each data entity (either file, record, or
element entity). Since the DD will typically be implemented as
one of the DBMS applications, these can be represented as the
relations:

FILE_ENTITY (entity_name,block_size,access_method,
             logical_record_size,physical_storage_device)

RECORD_ENTITY (entity_name,length,fixed_variable_code,key,
               composite_key,secondary_key,
               updating_time,updating_mode)

ELEMENT_ENTITY (entity_name,length,code,source,user,
                definition)


(1) Entity Name.
The name of the entity is limited to a
maximum of eight characters. It may be a combination of
alphabetic and numeric characters, but should have an
alphabetic character as its first character.
(2) Block Size.
Self explanatory.
(3) Access Method.
This may be encoded as:
S - Sequential
I - Indexed Sequential
D - Direct Access
V - Virtual Storage Access
(4) Logical Record Size. 
This is equal to record length.
(5) Physical Storage Device.
This may be coded as the following:
TAPE - Magnetic Tape
DISK - Magnetic Disk
DRUM - Magnetic Drum
FLOP - Floppy Disk / Diskette
CARD - Punched Card
PAPR - Paper Tape


(6) Fixed / Variable Code.
The codes used are:
FIX - Fixed Length Record
VAR - Variable Length Record
(7) Key.
This is an entity name used as the key for
both storing and accessing the relation. If the relation
uses a composite key, this will be the first part of the
composite key, and the second part will be stored as the
composite-key attribute.

(8) Composite Key.
This is the entity name used as the second
part of a composite key. If the primary key is not composite,
this attribute will be filled with "NONE".

(9) Secondary Key.
This is the entity name used as the secondary
key. If the record has no secondary key, this
attribute will be filled with "NONE".
(10) Updating Time.
This attribute contains a description
of how often this record will be updated. This will be a
number of days.

(11) Updating Mode.
This attribute contains a description of
how the updating is done. It is encoded as follows:
BATCH - Batch Processing
ONLIN - Online Processing
BOTH - Both Batch and Online


(12) Entity Length.
This denotes the length of an entity (may
be record, or element). In the case of a variable record,
this information will be filled with zeroes, since the
length of each record will be attached within the record
itself.

(13) Entity Code.
This is a code for the character type:
A - Alphabetic
N - Numeric
AN - Alphanumeric


(14) Source.
This denotes the organizational entity
responsible for providing, updating, and deleting the
entity.
(15) Users.
This denotes the organizational entity
(entities) allowed to retrieve and use the entity. If
subschemas are applied to the DD, this attribute will not be
necessary.
(16) Definition.
This provides a detailed narrative
describing the entity. This may include information
about frequency_of_update, range_of_acceptable_values, and
so on.
(17) Alias.
At the present time, no entity name
(especially element entity) has an alias attribute attached
to it. In the future, this attribute will surely be needed. One
way to accommodate this need is to add another relation,
called the alias relation, which has at least two attributes:
entity-name and its alias-name.
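
Some of the attribute rules for data entities lend themselves to automatic checking when an entry is recorded. The Python sketch below assumes the ELEMENT_ENTITY attribute names used in this chapter and checks only the entity-name rule and the entity code; the class name, the function name, and the error messages are illustrative.

```python
import re
from dataclasses import dataclass

ENTITY_CODES = {"A", "N", "AN"}  # Alphabetic, Numeric, Alphanumeric

# One alphabetic character followed by up to seven letters or digits.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def is_valid_entity_name(name):
    """True if `name` has at most eight characters and starts with a letter."""
    return _NAME_PATTERN.fullmatch(name) is not None


@dataclass
class ElementEntity:
    entity_name: str
    length: int
    code: str
    source: str
    user: str
    definition: str

    def __post_init__(self):
        if not is_valid_entity_name(self.entity_name):
            raise ValueError(f"invalid entity name: {self.entity_name!r}")
        if self.code not in ENTITY_CODES:
            raise ValueError(f"unknown entity code: {self.code!r}")
        if self.length <= 0:
            raise ValueError("element length must be positive")
```

A similar check could be written for the other coded attributes, such as the fixed/variable code or the updating mode, by comparing the value against the set of permitted codes.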
b. Process Entities

The information included as attributes in the
process entity are: name, input_entity, output_entity, and
the description of the process. These can be represented as
the relation:

PROCESS (process_name,description,input_entity,description_of_input,
         input_media,output_entity,description_of_output,output_media)
         --- key ---
However, since a process may have more than one input
or output entity, this relation should have a composite key
rather than having only the name as its single key. Because
only one input/output entity is allowed in every instance of
PROCESS, processes having more than one input/
output entity may waste storage, since all attributes will
appear unnecessarily more than once. In the case that this
relation has a composite key, a query asking which data
entities are input (or output) to a given process also poses
a difficulty, since this relation cannot be retrieved
using only the process_name (it must be retrieved using its
composite key, instead). In order to make a better layout,
those attributes may be arranged using the following three
relations:
PROCESS (process_name,description)
         --- key ---

PROCESS_INPUT (process_name,input_entity,description_of_input,
               input_media)
               --------- key ---------

PROCESS_OUTPUT (process_name,output_entity,description_of_output,
                output_media)
                ---------- key ----------


(1) Process Name.
The process name will be limited to a maximum of
eight characters, as applied to data entities.
(2) Description of Process. 
Self explanatory. 
(3) Input Data.
The input data may be either a file,
record, group of elements, data element, or data entered
from console.

(4) Description of Input.
This attribute may be filled with a
description such as the input data is "sorted by RANK" for
instance.

(5) Input Media.
As with the physical storage device attribute,
this attribute may be filled with the input storage device, or
data entered via console.


TAPE - Magnetic Tape
DISK - Magnetic Disk
DRUM - Magnetic Drum
FLOP - Floppy Disk / Diskette
CARD - Punched Card
PAPR - Paper Tape
CONS - Data Entered via Console


(6) Output Data. 
The output data may be magnetic data,
displayed data, or printed material.
(7) Description of Output.
This attribute may be filled with a
description such as the output data is "sorted by KOTAMA"
for instance.
(8) Output Media.
This attribute may be filled with the output
storage device or printed output material, and these are
encoded as follows:
TAPE - Magnetic Tape
DISK - Magnetic Disk
DRUM - Magnetic Drum
FLOP - Floppy Disk / Diskette
CARD - Punched Card
PAPR - Paper Tape
PRIN - Printed Material
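
Returning to the three-relation layout for process entities, the storage argument can be made concrete with a short Python sketch. It assumes a flattened form in which each tuple carries one input and one output, and uses the DPPS17 inputs and outputs quoted later in this chapter; the description text and the function name are made up for illustration.

```python
def decompose(flat_rows):
    """Split flattened process tuples into PROCESS, PROCESS_INPUT, PROCESS_OUTPUT."""
    process = {}
    inputs = set()
    outputs = set()
    for row in flat_rows:
        name = row["process_name"]
        process[name] = row["description"]
        inputs.add((name, row["input_entity"],
                    row["description_of_input"], row["input_media"]))
        outputs.add((name, row["output_entity"],
                     row["description_of_output"], row["output_media"]))
    return process, sorted(inputs), sorted(outputs)


# One input and two outputs force two flattened tuples, so the description
# and the whole input part are stored twice.
FLAT = [
    {"process_name": "DPPS17", "description": "hypothetical description",
     "input_entity": "DAPOKDPP", "description_of_input": "sorted by DPP",
     "input_media": "TAPE",
     "output_entity": "DAPOKDPP", "description_of_output": "sorted by KOTAMA",
     "output_media": "TAPE"},
    {"process_name": "DPPS17", "description": "hypothetical description",
     "input_entity": "DAPOKDPP", "description_of_input": "sorted by DPP",
     "input_media": "TAPE",
     "output_entity": "KEKPANGN", "description_of_output": "sorted by KOTAMA",
     "output_media": "DISK"},
]
```

After decomposition the description is held once, the single input once, and each output once, and every relation can be retrieved by process name alone.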


c. Usage Entities 


The information pertinent to these entities has
been included in, and can be derived from, the data entities
in terms of who is responsible for update operations and who is
allowed to retrieve a data entity. Here, this information
will be stated again from the reverse point of view, that
is, which data are the responsibility of this entity, and
which data are allowed to be retrieved by this entity.
Information that will be included as the usage entity's
attributes are: user_name, description of the user,
entity_name, and type_of_access. For reasons similar to
those discussed concerning the process entities, since a
given user may have more than one data entity as its
responsibility and/or available for retrieval, these attributes
may be represented by the following three relations:


USER (user_name,description)
      --- key ---

USER_ACCESS (user_name,entity_name,type_of_access)
             ------- key --------

USER_RESPONSIBILITY (user_name,entity_name)
                     ------- key --------
(1) User Name.
As with the data entity and process entity
names, the usage entity name will be limited to a maximum of
eight characters, too.

(2) Entity Name.
Entity_name here means a data_entity_name.
(3) User Responsibility.
This may be a list of file, record, group
of elements, or data element entities that are this user's
responsibility. This can be viewed as a subschema.

(4) User Access.
This may be a list of file, record, group
of elements, or data element entities that may be retrieved by
this user. This can be viewed as a subschema, and may or may
not be the same as the list of user responsibility items.

(5) Type of Access.
The type_of_access is either:
R - Read only
U - Update only
B - Both R and U
N - No access

Self explanatory.
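
The "reverse point of view" taken for usage entities can be derived mechanically from the source and users attributes of the data entities. The Python sketch below assumes each data entity is available as a mapping with the keys entity_name, source, and users; the function name and the organizational unit names in the usage note are hypothetical.

```python
from collections import defaultdict


def reverse_view(data_entities):
    """Derive user-oriented lists from the source/users attributes of data entities.

    Returns two mappings: user -> entities it is responsible for, and
    user -> entities it may retrieve.
    """
    responsibility = defaultdict(list)
    access = defaultdict(list)
    for entity in data_entities:
        responsibility[entity["source"]].append(entity["entity_name"])
        for user in entity["users"]:
            access[user].append(entity["entity_name"])
    return dict(responsibility), dict(access)
```

Given two elements whose source is a unit called UNIT_A and which may both be retrieved by UNIT_B, the sketch reports both elements under UNIT_A's responsibility and under UNIT_B's access list. As noted for the DDS in general, such lists only document the assignments; they do not enforce them.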
d. Summary of Relations

Table VIII summarizes the relations of
relationships, data entities, process entities, and usage
entities.

e. Example of Relations

Table IX presents examples of file, record, and
element entity relations. Examples of process entity relations
and user entity relations are presented as Table X.
Finally, an instance of the relationship relation is shown
in Table XI.
4. Example of Data Dictionary Use

Meta-data contained within the data dictionary may be
used to answer questions asked by top managers, users, or
technical staff (such as system analysts, programmers, and
operators). In the following, several queries and the
corresponding responses will be presented.

a. Top Management Queries

Top management may ask a question like: "How
often is the personnel masterfile updated ?"
Possible answer is:

PERSFILE is updated once every 30 days.
b. User Queries

The user responsible for personnel management
may ask the following question: "I need a list of personnel
having the rank of captain who speak French fluently and are
experienced in the intelligence field. Is DISPULLAHTAD able
to provide these data ?".

Possible answer is: 

See entities: 
1. PANGKAT (rank) in relation PERSINTL 
2. ASING (foreign language) in relation PERSING 
3. AKPASING (active/passive code) in relation PERSINTL 


c. Technical Staff Queries 


"What are the inputs and/or outputs of process 
DPPS17 ?", is one possible question asked by an operator for 
instance. 
Possible answer is: 
Process DPPS17 
Input is/are: 
1. DAPOKDPP (sorted by DPP ), media: TAPE 


Output is/are:
1. DAPOKDPP (sorted by KOTAMA ), media: TAPE
2. KEKPANGN (sorted by KOTAMA ), media: DISK 
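
An answer of this kind could be produced directly from the process relations. The Python sketch below uses the standard sqlite3 module as a stand-in for whatever DBMS hosts the dictionary; the table and column names follow the PROCESS_INPUT and PROCESS_OUTPUT layout as assumed here, and the function names are illustrative.

```python
import sqlite3


def build_process_relations():
    """Create in-memory PROCESS_INPUT / PROCESS_OUTPUT relations (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE process_input (
            process_name TEXT NOT NULL,
            input_entity TEXT NOT NULL,
            description_of_input TEXT,
            input_media TEXT,
            PRIMARY KEY (process_name, input_entity)
        );
        CREATE TABLE process_output (
            process_name TEXT NOT NULL,
            output_entity TEXT NOT NULL,
            description_of_output TEXT,
            output_media TEXT,
            PRIMARY KEY (process_name, output_entity)
        );
        """
    )
    conn.execute(
        "INSERT INTO process_input VALUES (?, ?, ?, ?)",
        ("DPPS17", "DAPOKDPP", "sorted by DPP", "TAPE"),
    )
    conn.executemany(
        "INSERT INTO process_output VALUES (?, ?, ?, ?)",
        [
            ("DPPS17", "DAPOKDPP", "sorted by KOTAMA", "TAPE"),
            ("DPPS17", "KEKPANGN", "sorted by KOTAMA", "DISK"),
        ],
    )
    return conn


def describe_process(conn, process_name):
    """Format the inputs and outputs of one process as a short report."""
    inputs = conn.execute(
        "SELECT input_entity, description_of_input, input_media "
        "FROM process_input WHERE process_name = ? ORDER BY input_entity",
        (process_name,),
    ).fetchall()
    outputs = conn.execute(
        "SELECT output_entity, description_of_output, output_media "
        "FROM process_output WHERE process_name = ? ORDER BY output_entity",
        (process_name,),
    ).fetchall()

    def numbered(rows):
        return [f"{i}. {e} ({d}), media: {m}" for i, (e, d, m) in enumerate(rows, 1)]

    lines = [f"Process {process_name}", "Input is/are:"]
    lines += numbered(inputs)
    lines.append("Output is/are:")
    lines += numbered(outputs)
    return "\n".join(lines)
```

Because both relations are keyed on the process name plus the entity name, each list is obtained with a single selection on process_name, which is the retrieval advantage argued for the decomposed layout.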


d. Unanticipated Queries

One of the advantages of designing the
dictionary using the relational model is that it can accommodate
unanticipated queries. As one of the DBMS's applications,
this dictionary supports any query expressible in a
relational query language (e.g. SEQUEL).

V. THE IMPLEMENTATION OF DATABASE AT DISPULLAHTAD

A. DATA DICTIONARY DESIGN AS A STEPPING-STONE


The implementation of a database at DISPULLAHTAD may
take advantage of the design of the Data Dictionary in the
preceding chapter in many ways:

* The designed DD may be used as the first DBMS application.
Then, the experience gained here can be used in the
future application of the real database implementation.

* The DD, as a repository of all meta-data, will provide a
full specification and description of all entities and
relationships between them. Given this, the implementation of
the DBMS will be faster due to fewer changes and iterations in
database development.

* In the case where an automatic database design tool is
used, such as DATA DESIGNER, it may be interfaced with the
DDS in order to take advantage of the DDS contents. The
database designer may benefit from the full descriptions of
each entity contained within the data dictionary.

In this regard, the design of the DD for current
DISPULLAHTAD applications can be considered as a
stepping-stone to the DBMS implementation.


B. DISPULLAHTAD'S DATABASE DESIGN

From Tables II through V, it can be seen that there are
many data element redundancies within the four applications.
These redundancies need to be eliminated by designing a
database fulfilling the five normal forms. One possible
method is to gather the common data elements in one relation,
and other specific data pertinent to each of the four
applications in separate relations. The specific data model
for each application will most likely consist of more than one
relation.

In designing the database, the availability of data descriptions
may be exploited to make this work easier. For example,
suppose the following relation is designed in the
personnel database:

MAINPERS (nopers,nama,pangkat,corps,jabatan,satminkl)
          - key -


Here, the possible descriptions contained within the
dictionary that may be extracted are:

* Which files and records would have to be accessed in
order to establish an instance of this relation ?

* What is/are the key/composite keys of each of the records
derived from the preceding query ?

* What is the length of each of those entities ?

Furthermore, in order to satisfy the five normal forms, a
full and clear description of each entity is needed. These
descriptions are contained within the data dictionary. The
issue of actual database design is beyond the scope of this
thesis and is left for possible follow-on thesis work.
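
The first of the questions listed above, which records must be accessed to establish an instance of a new relation, can be answered from the dictionary's record-to-element relationships. The Python sketch below assumes those relationships are available as a mapping from record name to the set of element names it contains; the function name and the record names in the usage note are hypothetical.

```python
def records_for_relation(record_elements, wanted):
    """Map each wanted element to the sorted names of the records containing it."""
    return {
        element: sorted(
            record
            for record, elements in record_elements.items()
            if element in elements
        )
        for element in wanted
    }
```

With a hypothetical dictionary in which record REC_A holds nopers, nama, and pangkat, and record REC_B holds nopers and pangkat, a request for nopers, nama, and pangkat shows that nama is available only from REC_A while the other two elements are stored redundantly in both records. This is the kind of redundancy that the database design would aim to remove.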
C. CHOOSING THE DATA DICTIONARY SYSTEM (DDS)
1. Features

The available commercial DDS have most of the
following features (see Figure 5.2 for more detail).

a. Dictionary Schema

This is a feature used to generate a manufacturer's
standard schema, such as entities, relationships, and
attributes.


tS 


Ek. Sehema Extensmpr dis. 


This is a feature wherepy an installation is 
able to customize the manufacturer's standard schema by 
adding to it new entity-types, relationship—-— gee, and 


attrikute-types. 
Cc. Dictionary Marneenance 


This is a feature that enables an installation
to create and maintain its data dictionary.
d. Reports and Queries 


An installation may generate reports using
meta-data contained within the data dictionary. A DDS
provides these abilities via these features.
e. Bridge/Interface Facility 


This feature generates descriptions from the
data dictionary needed by other systems, typically an
application development tool such as DATA DESIGNER.
f. Program Access Facility


This enables an installation to extend the
functionality of a DDS, i.e., the preparation of programs
to access data dictionary contents.
g. Status Facility


This feature provides the status of each entity,
especially when the data dictionary is used in the system
life cycle, by providing aids in application development.


h. Security Facility


An installation may restrict access to the data 
dictionary to authorized personnel only. This requirement is 
fulfilled by this feature. 
a. Data Dictionary Features


There are five data dictionary features required
in order to implement DISPULLAHTAD's data dictionary. These
are:

• Dictionary Schema, which is needed in order to
generate the entities, relationships, and attributes.

• Schema Extensibility, which will be needed to
accommodate specific needs that cannot be fulfilled by the
Dictionary Schema.

• Dictionary Maintenance, which is used to create
and maintain the data dictionary.

• Reports and Queries, which are required in
order to generate reports from and to extract data contained
within the data dictionary.

• Security Facility. Even though the data
dictionary has no actual instances of "secured" data, it
will contain information about such data that can be used to
access it, such as entity_name, access_method, and where such
data are stored. Therefore, a security facility is required
to add to and strengthen the level of security.

Other features are optional. These can be
considered as "nice to have".
b. Active Versus Passive Data Dictionary 


A fully active or partially active system is very
desirable because it provides features such as enforcing
standards, range_of_value auditing, transaction monitoring,
etc. On the other hand, an active system carries considerable
overhead, suffers longer turn-around times, and requires
more complex processing algorithms. Therefore, at this time,
a passive data dictionary system is more appropriate for
DISPULLAHTAD. In the future, after enough experience has
been gained, a partially active system may be applied in
order to take fuller advantage of the data dictionary.
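
The difference between passive and active use, and the overhead that active use adds, can be sketched with range_of_value auditing. This is a minimal illustration in Python, not a description of any commercial DDS. The element "umur" and its bounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRule:
    """Range-of-value rule recorded in the dictionary for one element."""
    name: str
    low: int
    high: int


# Hypothetical dictionary entry; the element name and bounds are invented.
DICTIONARY = {"umur": ElementRule("umur", 18, 60)}


def passive_store(record, element, value):
    """Passive use: the dictionary is documentation only; nothing is checked."""
    record[element] = value
    return record


def active_store(record, element, value):
    """Active use: every update is audited against the dictionary first."""
    rule = DICTIONARY.get(element)
    if rule is None:
        raise KeyError(f"{element!r} is not defined in the dictionary")
    if not rule.low <= value <= rule.high:
        raise ValueError(
            f"{element}={value} is outside the range {rule.low}..{rule.high}"
        )
    record[element] = value
    return record
```

In the active case every update pays for a dictionary lookup and a check, which is a small-scale picture of the processing overhead mentioned above. In the passive case the same rule exists only as a description that people and reports may consult.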
c. Free-standing Versus DBMS-dependent System


A free-standing system is very appropriate for
an installation having different DBMSs. This may happen in
an installation with databases using network or hierarchical
structures in conjunction with newer technology such as the
relational system. In this case, all systems may access the
data dictionary independently, since the usage of the data
dictionary is not limited to any one system.

On the other hand, a data dictionary may be
implemented as one of the DBMS applications (some DDSs are
implemented in this manner) [Ref. 8]. This approach is
appropriate for an installation implementing databases using
a single DBMS, and DISPULLAHTAD falls into this category.
Therefore, a DBMS-dependent system will provide more
advantages for DISPULLAHTAD, e.g., it can be used as a
training tool in implementing the database. Another possible
advantage is that if DISPULLAHTAD should change from a
passive to an active system, there will not be too many
modifications required because the data dictionary and the
databases are already compatible.


d. Make or Buy


Buying an available commercial system may
provide a high quality and ready-to-use system. Figure 5.2
summarizes features of current commercial systems that may
be used in choosing the best system fulfilling the required
features. A commercial system used by many installations
without much trouble may be an indication that it has a
certain quality. Criteria listed in Figure 5.1 may be used
in selecting the best system for DISPULLAHTAD.




1. It should have at least the following five features
that may be considered as the primary criteria:

   a. Dictionary Schema.
   b. Schema Extensibility.
   c. Dictionary Maintenance.
   d. Reports and Queries.
   e. Security Facility.

2. The following four criteria may be considered as
the secondary criteria:

   a. Compatible with current hardware.
   b. Compatible with applied DBMS.
   c. High quality assurance.
   d. Low acquisition and set-up costs.

Figure 5.1 Criteria for Choosing
Commercial Data Dictionary System.
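
One possible way to apply the criteria of Figure 5.1 is sketched below in Python: candidates lacking any primary feature are dropped, and the rest are ranked by how many secondary criteria they meet. The equal weighting of the secondary criteria is an assumption made here, and the three candidate systems are invented for illustration.

```python
# Primary criteria from Figure 5.1: all five must be present.
PRIMARY = frozenset({
    "dictionary schema",
    "schema extensibility",
    "dictionary maintenance",
    "reports and queries",
    "security facility",
})

# Secondary criteria from Figure 5.1, weighted equally (an assumption).
SECONDARY = frozenset({"hardware", "dbms", "quality", "cost"})

# Hypothetical candidates; names and ratings are invented.
CANDIDATES = [
    {"name": "DDS-A",
     "features": PRIMARY | {"status facility"},
     "secondary": {"hardware", "dbms"}},
    {"name": "DDS-B",
     "features": PRIMARY - {"security facility"},
     "secondary": {"hardware", "dbms", "quality", "cost"}},
    {"name": "DDS-C",
     "features": set(PRIMARY),
     "secondary": {"hardware", "dbms", "quality"}},
]


def shortlist(candidates):
    """Keep candidates with every primary feature; best secondary score first."""
    qualified = [c for c in candidates if PRIMARY <= c["features"]]
    return sorted(
        qualified,
        key=lambda c: len(SECONDARY & c["secondary"]),
        reverse=True,
    )
```

In this sample, DDS-B is excluded despite meeting all secondary criteria because it lacks a security facility, which reflects the intent that the primary criteria are mandatory.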


On the other hand, one significant advantage of
designing a DDS in-house is that the system can be fitted to
the specific requirements of the installation. Furthermore,
given that DISPULLAHTAD will implement a DBMS, the
relational dictionary presented in the previous chapter may
be a good candidate for the first application. By
implementing the dictionary as a relational database, only
one system needs to be acquired (DBMS) instead of two (DBMS
and DDS).
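
A side benefit of keeping the dictionary inside the same DBMS as the application data can be sketched as follows. This is a minimal illustration assuming Python's built-in sqlite3 module; the dictionary relation dd_attribute is hypothetical and is not the model of the previous chapter, and the column types are assumed. The attribute names are borrowed from MAINPERS.

```python
import sqlite3


def open_database():
    """Create one database holding both application data and dictionary."""
    conn = sqlite3.connect(":memory:")
    # Application relation (attribute names from MAINPERS; types assumed).
    conn.execute(
        "CREATE TABLE mainpers (nopers TEXT PRIMARY KEY, nama TEXT, pangkat TEXT)"
    )
    # Hypothetical dictionary relation kept in the same database.
    conn.execute("CREATE TABLE dd_attribute (relation TEXT, attribute TEXT)")
    conn.executemany(
        "INSERT INTO dd_attribute VALUES (?, ?)",
        [
            ("mainpers", "nopers"),
            ("mainpers", "nama"),
            ("mainpers", "pangkat"),
            ("mainpers", "corps"),  # documented but not yet implemented
        ],
    )
    return conn


def dictionary_drift(conn, relation):
    """Compare dictionary entries with the columns the DBMS actually holds.

    Returns (documented but missing, present but undocumented).
    The relation name is interpolated into the PRAGMA, so pass only
    trusted names.
    """
    actual = {row[1] for row in conn.execute(f"PRAGMA table_info({relation})")}
    documented = {
        row[0]
        for row in conn.execute(
            "SELECT attribute FROM dd_attribute WHERE relation = ?", (relation,)
        )
    }
    return sorted(documented - actual), sorted(actual - documented)
```

Because both sets of relations sit in one database, a single connection is enough to check the dictionary against the implemented schema; with a separate DDS this comparison would require a bridge between two systems.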


3. Recommendation


For the reasons discussed in the previous section,
it is better for DISPULLAHTAD to implement the data
dictionary model described in the previous chapter as the
first application of the DBMS. The dictionary will be a
DBMS-dependent system. Initially, the system should be
passive, and later, if appropriate, it may be changed to an
active system.
VI. CONCLUSION

Implementing a DBMS is a "must" for DISPULLAHTAD in
order to control the proliferation of its applications that
in turn raises problems of data redundancy and data
inconsistency. Primarily, a DBMS provides data manipulation
capabilities whereas a data dictionary provides management
and control. Applying management and control (by
implementing a DDS) first will make the job of database
design and implementation easier by lessening the
difficulties, time, and effort required to develop
databases. In this regard, designing a data dictionary may
be considered as a stepping-stone to the implementation of a
DBMS.

This thesis has presented a relational model of a
dictionary which can satisfy the needs of DISPULLAHTAD. The
advantages of this model are:

1) It is compatible with any relational DBMS that
DISPULLAHTAD may procure.

2) It obviates the need to buy a DDS in addition to the
DBMS.

3) It can be tailored specifically to DISPULLAHTAD's
needs because of the flexibility of the relational model.

4) It satisfies the criteria for a DDS (see Figure 5.1).

In order to attain the objective of managing and
controlling data resources, the following implementation
policy is recommended:

• All personnel involved in software development (such
as system analysts, programmers, etc.) should use the data
dictionary extensively in doing their jobs.

• Only the data administrator staff may update the data
dictionary; others may access it in read-only mode.

• All suggestions concerning the data dictionary may be
addressed to the data administrator staff.
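
The read-only rule in this policy can be sketched at the connection level. This is only an illustration, assuming Python's built-in sqlite3 module and its URI option mode=ro; a production DBMS would more likely enforce the rule through its own user privileges. The relation dd_element and its sample entry are hypothetical.

```python
import os
import pathlib
import sqlite3
import tempfile


def create_dictionary(path):
    """Administrator connection: read-write, used to load the dictionary."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE dd_element (name TEXT PRIMARY KEY, note TEXT)")
    conn.execute("INSERT INTO dd_element VALUES ('nopers', 'sample entry')")
    conn.commit()
    conn.close()


def open_read_only(path):
    """Connection for all other staff: the file is opened read-only."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def can_update(conn):
    """Try to add an entry; report whether the DBMS allowed it."""
    try:
        conn.execute("INSERT INTO dd_element VALUES ('nama', 'sample entry')")
        return True
    except sqlite3.OperationalError:
        return False


def demo():
    """Return (entries visible to a reader, whether the reader could update)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "dictionary.sqlite")
        create_dictionary(path)
        conn = open_read_only(path)
        try:
            count = conn.execute("SELECT COUNT(*) FROM dd_element").fetchone()[0]
            return count, can_update(conn)
        finally:
            conn.close()
```

A reader opened this way can query the dictionary freely, while any attempt to change it is refused by the DBMS rather than relying on staff discipline alone.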

This thesis has stopped short of suggesting a database
design for DISPULLAHTAD's personnel application. Follow-on
work could be done using the dictionary model suggested
herein as a foundation.
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